Real-time Reinforcement Learning Control of Dynamic Systems Applied to an Inverted Pendulum

Author

  • W. van Luenen
Abstract

This paper describes work which was started in order to investigate the use of neural networks in adaptive or learning control systems. Neural networks have learning capabilities and can be used to realize non-linear mappings. These attractive features could make them useful building blocks for non-linear adaptive or learning controllers. The work can be motivated as follows. For some processes, only ill-defined models are available to the control engineer. Despite this problem, these processes need to be controlled in some way. Neural networks may be useful here because they are able to learn such a model without a preliminary choice of the model structure. After learning, the model structure is represented by the structure and (non-linear) elements of the neural network. The model parameters are represented by the weight factors of the network.

Until now, only well-known processes have been used in neural control research. The reason for this is that much of the behavior of neural networks is unknown. Convergence and stability of network learning algorithms have not been proved. The simple problems regarded here are needed to study learning behavior and to interpret the knowledge obtained by learning. This is necessary in order to estimate the complexity of problems which can be solved using neural control. One of the main problems in the neural control field is that neural networks have never been designed or regarded from this point of view. The concepts of neural networks have been developed by scientists from various disciplines, among them biologists, physicists, psychologists, mathematicians and so on. After the first successful applications, (control) engineers became interested in neural networks. The historical development of neural network research has resulted in a number of algorithms which have often been developed from a biological point of view. These algorithms are now studied by engineers to investigate their possible use in various technological fields. The work described in this paper concerns such an algorithm.

The first part of the paper is a short introduction. The concepts of adaptive control theory are outlined, and some neural network paradigms which could be useful for control purposes are described. The field of neural networks is regarded through the eyes of a control engineer. It will be shown that neural networks can be seen as complex multivariable adaptive systems. Simple neural networks can be configured in such a way that well-known adaptive controller configurations appear. Furthermore, the similarities in adaptation rules of adaptive controllers and neural networks are shown. Finally, the direct and indirect adaptive control schemes will be introduced in which the neural network paradigms of reinforcement learning and supervised learning (using back propagation) may be used.

The second part of the paper consists of a case study of a reinforcement learning control system published by Barto et al. (1983) and worked out by Anderson (1987). Although not designed for this purpose, it is used for the control of physical dynamical systems. It will be shown that this algorithm can be regarded as a direct adaptive control scheme. The meaning of the algorithm will be reviewed in a control engineering context, revealing its attractive and less attractive features. It will be shown that the algorithm is not generally applicable to control problems. However, some minor modifications solve this problem.

Related resources

A learning-based control system by knowledge acquisition within constrained environment

Knowledge acquisition is important for building a knowledge-based system. One of the methods used to acquire knowledge is reinforcement learning, which is commonly defined as trial-and-error learning that occurs in episodes. This makes it difficult to ensure the safety of a real control object, since the control object is restricted by its environmental constraints. Th...

Inverted Pendulum Control Using Negative Data

In the training phase of learning algorithms, it is always important to have a suitable training data set. The presence of outliers, noisy data, and inappropriate data always affects the performance of existing algorithms. The active learning method (ALM) is one of the powerful tools in soft computing inspired by the computation of the human brain. The operation of this algorithm is complete...

Friction Compensation for Dynamic and Static Models Using Nonlinear Adaptive Optimal Technique

Friction is a nonlinear phenomenon which has destructive effects on the performance of control systems. Friction compensation is an effective way to obviate these effects. In this paper, an adaptive technique is proposed to eliminate limit cycles, one of the undesired behaviors that frequently arise from the presence of friction in control systems. The proposed approach works for n...

Q Learning based Reinforcement Learning Approach to Bipedal Walking Control

Reinforcement learning has been an active research area in recent years, not only in machine learning but also in control engineering, operations research and robotics. It is a model-free learning control method that can solve Markov decision problems. Q-learning is an incremental dynamic programming procedure that determines the optimal policy in a step-by-step manner. It is an online procedure for...

MINIMUM TIME SWING UP AND STABILIZATION OF ROTARY INVERTED PENDULUM USING PULSE STEP CONTROL

This paper proposes an approach for the minimum time swing up of a rotary inverted pendulum. Our rotary inverted pendulum is supported by a pivot arm. The pivot arm rotates in a horizontal plane by means of a servomotor. The opposite end of the arm is instrumented with a joint whose axis is along the radial direction of the motor. A pendulum is suspended at the joint. The task is to design a contro...

Safe Model-based Reinforcement Learning with Stability Guarantees

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorith...

Publication date: 2004